DETAILED ACTION
Claim Rejections - 35 USC §102 

1. The following is a quotation of the appropriate paragraphs of 35
U.S.C. 102 that form the basis for the rejections under this section made in this 
Office action: 

A person shall be entitled to a patent unless - (b) the invention was patented or described in a 
printed publication in this or a foreign country or in public use or on sale in this country, more than 
one year prior to the date of application for patent in the United States. 

2. Claims 25-30, 39-41, 43-47, and 49 are rejected under 35 U.S.C. 102(b)
as being anticipated by Kimber et al. (US Patent No. 5598507). 

3. Regarding claim 39, Kimber et al. disclose an apparatus for processing a 
continuous audio stream containing human speech related to at least one 
particular transaction, comprising: a predeterminer which predetermines at least 
one speaker (element 212 in figure 12, initial training of speaker models HMMs);
a detector which detects speaker changes in the audio stream (col. 11, ln. 38-67);
a recognizer which recognizes the predetermined speaker in the audio
stream (col. 11, ln. 38 to col. 12, ln. 27); an indexer for indexing the audio stream
dependent on a detected speaker change and a recognized predetermined
speaker (fig. 12 or col. 11, ln. 38 to col. 12, ln. 27).

4. Regarding claims 25, 43, and 49, Kimber et al. disclose a method, 
apparatus, and program storage device readable by machine for processing a 
continuous audio stream containing human speech related to at least one 
particular transaction, comprising the steps of: digitizing the continuous audio
stream (ADC is inherently included in a computer system of figure 11); detecting
a speaker change in the digitized audio stream (col. 11, ln. 38-67); performing a
speaker recognition if a speaker change is detected (col. 11, ln. 38 to col. 12,
ln. 27); indexing the audio stream with respect to the detected speaker change if a
predetermined speaker is recognized (fig. 12 or col. 11, ln. 38 to col. 12, ln. 27).

5. Regarding claims 26 and 44, Kimber et al. further disclose a method and 
apparatus according to claims 25 and 39, comprising the further step of 
protocolling time information for detected speaker changes (col. 11, ln. 21-37).

6. Regarding claims 27 and 40, Kimber et al. further disclose a method and
apparatus according to claims 25 and 31, wherein the step of detecting a
speaker change and/or the step of performing a speaker recognition is/are
preceded by the further step of detecting non-speech boundaries between
continuous speech segments (col. 12, ln. 1-10, specifically elements 212 or 216
in figure 12).

7. Regarding claim 28, Kimber et al. further disclose a method according to 
claim 25, wherein the step of detecting a speaker change is accomplished by use 
of at least one characteristic audio feature, in particular features derived from the 
spectrum of the audio signal (col. 12, ln. 1-20, spectral feature vectors to train
HMM are derived from audio signal for comparison with stored models). 

8. Regarding claim 29, Kimber et al. further disclose a method according to
claim 25, wherein the step of performing a speaker recognition involves the 
particular steps of calculating a speaker signature from the audio stream and 
comparing the calculated speaker signature with at least one known speaker 
signature (col. 5, ln. 10-27, spectral feature vectors used to train the HMM are
speaker signatures). 

9. Regarding claim 30, Kimber et al. further disclose a method according to 
claim 25 for use in a speech recognition or voice control system comprising at 
least two speaker-specific speaker models and/or dictionaries, wherein 
interchanging between the at least two speaker-specific dictionaries dependent 
on the detected speaker change and the corresponding recognized speaker (col.
11, ln. 13 to col. 12, ln. 20 and figure 9).

10. Regarding claim 41, Kimber et al. further disclose an apparatus according
to claim 39, further comprising a scanner which automatically scans a continuous
audio record, in particular a continuous audio stream recorded on a data or a
signal carrier, and for detecting speaker changes in the continuous audio record
(figure 11 or col. 11, ln. 13-37).

11. Regarding claim 45, Kimber et al. further disclose an apparatus according
to claim 39, comprising means for marking at least the beginning of a detected
speech segment related to a predetermined speaker (col. 11, ln. 21-37).

12. Regarding claim 46, Kimber et al. further disclose an apparatus according 
to claim 39, comprising a database, which stores speech signatures for at least two
speakers (the operation of figure 12 stores initial training speaker models).

13. Regarding claim 47, Kimber et al. disclose a speech recognition
processing an incoming audio stream and having at least two speaker models
and/or speaker-specific dictionaries, comprising: a detector which detects a
speaker change in the incoming audio stream (col. 11, ln. 38-67); a gatherer which
gathers speaker-specific information with corresponding speaker-specific
information of at least one predetermined speaker thus recognizing the at least
one predetermined speaker (col. 5, ln. 10 to col. 10, ln. 67, input audio signal is
parameterized into feature vectors for comparing with the speaker templates);
and an interchanger which interchanges between the at least two
speaker-specific dictionaries dependent on the detected speaker change and the
corresponding recognized speaker (figure 12, the system of figure 12 contains a
number of trained speaker recognized models, each is compared with input
models to determine a speaker match).

14. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for
all obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described
as set forth in section 102 of this title, if the differences between the subject matter sought to 
be patented and the prior art are such that the subject matter as a whole would have been 
obvious at the time the invention was made to a person having ordinary skill in the art to which 
said subject matter pertains. Patentability shall not be negatived by the manner in which the 
invention was made. 

15. Claims 19-24, 31-38, 42, and 48 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Kimber et al. (US Patent No. 5598507) in view of 
Glickman et al. (US Patent No. 6076059). 

16. Regarding claim 31, Kimber et al. disclose an apparatus for processing a
continuous audio stream containing human speech related to at least one
particular transaction, comprising: a predeterminer which predetermines at least
one speaker (element 212 in figure 12, initial training of speaker models HMMs);
a detector which detects speaker changes in the audio stream (col. 11, ln. 38-67);
a recognizer which recognizes the predetermined speaker in the audio
stream (col. 12, ln. 1-27).

Kimber et al. fail to disclose an initiator which initiates transcription of at 
least part of the audio stream in case of a detected speaker change and a 
recognized predetermined speaker. However, Glickman et al. teach an initiator 
which initiates transcription of at least part of the audio stream in case of a 
detected speaker change and a recognized predetermined speaker (col. 5, ln.
30-67).

Since Kimber et al. and Glickman et al. are analogous art because they
are from the same field of endeavor, it would have been obvious to one of
ordinary skill in the art at the time of invention to modify Kimber et al. by 
incorporating the teaching of Glickman et al. in order to provide automatic closed- 
caption using speaker-dependent models to enhance speech recognition 
accuracy. 

17. Regarding claims 19, 34, 42, and 48, Kimber et al. disclose a method, 
apparatus, and a program storage device readable by machine for processing a 
continuous audio stream containing human speech related to at least one 
particular transaction, comprising the steps of: digitizing the continuous audio 
stream (ADC is inherently included in a computer system of figure 11); detecting
a speaker change in the digitized audio stream (col. 11, ln. 38-67); performing a
speaker recognition if a speaker change is detected (col. 12, ln. 1-27).

Kimber et al. fail to disclose the step of transcribing at least part of the 
continuous audio stream if a predetermined speaker is recognized. However, 
Glickman et al. teach the step of transcribing at least part of the continuous audio 
stream if a predetermined speaker is recognized (col. 5, ln. 30-67).

Since Kimber et al. and Glickman et al. are analogous art because they
are from the same field of endeavor, it would have been obvious to one of
ordinary skill in the art at the time of invention to modify Kimber et al. by
incorporating the teaching of Glickman et al. in order to provide automatic
closed-caption using speaker-dependent models to enhance speech recognition
accuracy.

18. Regarding claim 35, Kimber et al. disclose an apparatus for processing a 
continuous audio stream containing human speech related to at least one 
particular transaction, comprising the steps of: digitizing the continuous audio 
stream (ADC is inherently included in a computer system of figure 11); detecting
a speaker change in the digitized audio stream (col. 11, ln. 38-67); performing a
speaker recognition if a speaker change is detected (col. 11, ln. 38 to col. 12,
ln. 27); indexing the audio stream with respect to the detected speaker change if a
predetermined speaker is recognized (fig. 12 or col. 11, ln. 38 to col. 12, ln. 27).

19. Regarding claims 20 and 36, Kimber et al. further disclose a method and 
apparatus according to claims 19 and 31, comprising the further step of 
protocolling time information for detected speaker changes (col. 11, ln. 21-37).

20. Regarding claims 21 and 32, Kimber et al. further disclose a method and 
apparatus according to claims 19 and 39, wherein the step of detecting a 
speaker change and/or the step of performing a speaker recognition is/are 
preceded by the further step of detecting non-speech boundaries between 
continuous speech segments (col. 12, ln. 1-10, specifically elements 212 or 216
in figure 12). 

21. Regarding claim 22, Kimber et al. further disclose a method according to
claim 19, wherein the step of detecting a speaker change is accomplished by use
of at least one characteristic audio feature, in particular features derived from the
spectrum of the audio signal (col. 12, ln. 1-20, spectral feature vectors to train
HMM are derived from audio signal for comparison with stored models).

22. Regarding claim 23, Kimber et al. further disclose a method according to 
claim 19, wherein the step of performing a speaker recognition involves the
particular steps of calculating a speaker signature from the audio stream and
comparing the calculated speaker signature with at least one known speaker
signature (col. 5, ln. 10-27, spectral feature vectors used to train the HMM are
speaker signatures). 

23. Regarding claim 24, Kimber et al. further disclose a method according to 
claim 19 for use in a speech recognition or voice control system comprising at 
least two speaker-specific speaker models and/or dictionaries, wherein 
interchanging between the at least two speaker-specific dictionaries dependent 
on the detected speaker change and the corresponding recognized speaker (col.
11, ln. 13 to col. 12, ln. 20 and figure 9).

24. Regarding claim 33, Kimber et al. further disclose an apparatus according 
to claim 31, further comprising a scanner which automatically scans a continuous
audio record, in particular a continuous audio stream recorded on a data or a
signal carrier, and for detecting speaker changes in the continuous audio record
(figure 11 or col. 11, ln. 13-37).

25. Regarding claim 37, Kimber et al. further disclose an apparatus according 
to claim 31, comprising means for marking at least the beginning of a detected 
speech segment related to a predetermined speaker (col. 11, ln. 21-37).

26. Regarding claim 38, Kimber et al. further disclose an apparatus according 
to claim 31, comprising a database, which stores speech signatures for at least two
speakers (the operation of figure 12 stores initial training speaker models).

Conclusion 

Any inquiry concerning this communication or earlier communications from 
the examiner should be directed to Huyen Vo whose telephone number is 703- 
305-8665. The examiner can normally be reached on M-F, 9-5:30. 

If attempts to reach the examiner by telephone are unsuccessful, the 
examiner's supervisor, Doris To, can be reached on 703-305-4827. The fax
phone number for the organization where this application or proceeding is 
assigned is 703-872-9306. 

Information regarding the status of an application may be obtained from
the Patent Application Information Retrieval (PAIR) system. Status information
for published applications may be obtained from either Private PAIR or Public
PAIR. Status information for unpublished applications is available through
Private PAIR only. For more information about the PAIR system, see
http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free).

Examiner Huyen X. Vo August 30, 2004 